Conference Proceedings

Statistical compression of protein folding patterns for inference of recurrent substructural themes

Ramanan Subramanian, Lloyd Allison, Peter J Stuckey, Maria Garcia de la Banda, David Abramson, Arthur M Lesk, Arun S Konagurthu, A Bilgin (ed.), MW Marcellin (ed.), J Serra-Sagrista (ed.), JA Storer (ed.)

Data Compression Conference Proceedings | IEEE Computer Society | Published: 2017

Abstract

Computational analyses of the growing corpus of three-dimensional (3D) structures of proteins have revealed a limited set of recurrent substructural themes, termed super-secondary structures. Knowledge of super-secondary structures is important for the study of protein evolution and for the modeling of proteins with unknown structures. Characterizing a comprehensive dictionary of these super-secondary structures has been an unanswered computational challenge in protein structural studies. This paper presents an unsupervised method for learning such a comprehensive dictionary using the statistical framework of lossless compression on a database comprised of concise geometric representations o..

